Add `ard_categorical_max()` #244

edelarua · 2024-11-25T23:27:02Z

What changes are proposed in this pull request?

Added function ard_categorical_max() to calculate categorical occurrence rates by maximum level per unique ID. (Add ARD function for counting patients #240)

Closes #240

Pre-review Checklist (if item does not apply, mark it as complete)

All GitHub Action workflows pass with a ✅
PR branch has pulled the most recent updates from master branch: usethis::pr_merge_main()
If a bug was fixed, a unit test was added.
If a new ard_*() function was added, it passes the ARD structural checks from cards::check_ard_structure().
If a new ard_*() function was added, set_cli_abort_call() has been set.
If a new ard_*() function was added and it depends on another package (such as broom), is_pkg_installed("broom") has been set in the function call and the following added to the roxygen comments: @examplesIf do.call(asNamespace("cardx")$is_pkg_installed, list(pkg = "broom"))
Code coverage is suitable for any new functions/features (generally, 100% coverage for new code): devtools::test_coverage()

Reviewer Checklist (if item does not apply, mark it as complete)

If a bug was fixed, a unit test was added.
Code coverage is suitable for any new functions/features: devtools::test_coverage()

When the branch is ready to be merged:

Update NEWS.md with the changes from this pull request under the heading "# cardx (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end of the update (see NEWS.md for examples).
All GitHub Action workflows pass with a ✅
Approve Pull Request
Merge the PR. Please use "Squash and merge" or "Rebase and merge".

github-actions · 2024-11-25T23:32:19Z

Unit Tests Summary

1 files 169 suites 1m 15s ⏱️
167 tests 167 ✅ 0 💤 0 ❌
695 runs 695 ✅ 0 💤 0 ❌

Results for commit 2f14d54.

♻️ This comment has been updated with latest results.

github-actions · 2024-11-25T23:32:22Z

Unit Test Performance Difference

| Test Suite | Status | Time on `main` | ±Time | ±Tests | ±Skipped | ±Failures | ±Errors |
|---|---|---|---|---|---|---|---|
| ard_categorical_max | 👶 | | +0.00 | +1 | 0 | 0 | 0 |

Additional test case details

| Test Suite | Status | Time on `main` | ±Time | Test Case |
|---|---|---|---|---|
| ard_categorical.survey.design | 💚 | 15.43 | -1.83 | ard_categorical.survey.design_works |
| ard_categorical_max | 👶 | | +0.00 | ard_categorical_max_errors_with_incomplete_factor_columns |
| ard_categorical_max | 👶 | | +0.00 | ard_categorical_max_follows_ard_structure |
| ard_categorical_max | 👶 | | +0.00 | ard_categorical_max_quiet_works |
| ard_categorical_max | 👶 | | +0.00 | ard_categorical_max_statistic_works |
| ard_categorical_max | 👶 | | +2.85 | ard_categorical_max_works_with_default_settings |
| ard_categorical_max | 👶 | | +0.01 | ard_categorical_max_works_with_pre_ordered_factor_variables |
| ard_categorical_max | 👶 | | +0.00 | ard_categorical_max_works_without_any_variables |
| ard_continuous.survey.design | 💔 | 16.25 | +1.65 | unstratified_ard_continuous.survey.design_works |

Results for commit aa39b27

♻️ This comment has been updated with latest results.

github-actions · 2024-11-25T23:47:40Z

Code Coverage Summary

Filename                                Stmts    Miss  Cover    Missing
------------------------------------  -------  ------  -------  -----------------------------------
R/add_total_n.survey.design.R              12       0  100.00%
R/ard_aod_wald_test.R                      77       8  89.61%   38-43, 93, 96
R/ard_attributes.survey.design.R            2       0  100.00%
R/ard_car_anova.R                          45       2  95.56%   62, 65
R/ard_car_vif.R                            62       1  98.39%   87
R/ard_categorical_ci.R                     96       1  98.96%   83
R/ard_categorical_ci.survey.design.R      129       1  99.22%   180
R/ard_categorical.survey.design.R         392       8  97.96%   77, 227-230, 274, 516, 530
R/ard_continuous_ci.R                      28       1  96.43%   38
R/ard_continuous_ci.survey.design.R       138       0  100.00%
R/ard_continuous.survey.design.R          274      14  94.89%   86, 177, 187, 338, 369-370, 418-426
R/ard_dichotomous.survey.design.R          73       3  95.89%   51, 156, 161
R/ard_effectsize_cohens_d.R               103       2  98.06%   69, 122
R/ard_effectsize_hedges_g.R                91       2  97.80%   68, 120
R/ard_emmeans_mean_difference.R            70       0  100.00%
R/ard_event_rates.R                        76      16  78.95%   72-75, 81, 113-116, 127-133
R/ard_missing.survey.design.R              79       1  98.73%   52
R/ard_regression_basic.R                   16       1  93.75%   46
R/ard_regression.R                         73       0  100.00%
R/ard_smd_smd.R                            69       5  92.75%   57, 83-86
R/ard_stats_anova.R                        95       0  100.00%
R/ard_stats_aov.R                          46       0  100.00%
R/ard_stats_chisq_test.R                   40       1  97.50%   39
R/ard_stats_fisher_test.R                  43       1  97.67%   42
R/ard_stats_kruskal_test.R                 36       1  97.22%   38
R/ard_stats_mcnemar_test.R                 80       2  97.50%   63, 106
R/ard_stats_mood_test.R                    49       1  97.96%   45
R/ard_stats_oneway_test.R                  39       0  100.00%
R/ard_stats_poisson_test.R                 76       1  98.68%   59
R/ard_stats_prop_test.R                    85       1  98.82%   43
R/ard_stats_t_test_onesample.R             41       0  100.00%
R/ard_stats_t_test.R                      112       2  98.21%   65, 111
R/ard_stats_wilcox_test_onesample.R        42       0  100.00%
R/ard_stats_wilcox_test.R                  99       2  97.98%   65, 117
R/ard_survey_svychisq.R                    38       1  97.37%   44
R/ard_survey_svyranktest.R                 54       1  98.15%   44
R/ard_survey_svyttest.R                    53       1  98.11%   42
R/ard_survival_survdiff.R                  89       0  100.00%
R/ard_survival_survfit_diff.R              76       0  100.00%
R/ard_survival_survfit.R                  197       5  97.46%   211-215
R/construction_helpers.R                  106      10  90.57%   160-175, 189, 248
R/proportion_ci.R                         195       1  99.49%   454
TOTAL                                    3596      97  97.30%

Diff against main

Filename               Stmts    Miss  Cover
-------------------  -------  ------  -------
R/ard_event_rates.R      +76     +16  +78.95%
TOTAL                    +76     +16  -0.40%

Results for commit: c023a1a

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

ddsjoberg · 2024-11-25T23:49:19Z

Thanks @edelarua ! Can you help me understand the differences between this and the hierarchical function? Also, when we refer to ordered, does that mean ordered factors?

edelarua · 2024-11-26T00:02:09Z

> Thanks @edelarua ! Can you help me understand the differences between this and the hierarchical function? Also, when we refer to ordered, does that mean ordered factors?

This performs similarly to the hierarchical function but doesn't use hierarchies, so it calculates event rates for flat tables. The ordered argument is used to count "ordered" variables by their highest level (such as grade or severity). The hierarchy function cannot do this unless there is more than one level in the hierarchy, because it currently relies on a workaround that moves the analysis variable to by.

For example, the grade rows in this table could be calculated with ard_event_rates(variables = AESEV, ordered = TRUE) but not ard_hierarchical(): https://insightsengineering.github.io/tlg-catalog/stable/tables/adverse-events/aet01_aesi.html#output

Honestly, we could probably simplify the hierarchical functions by implementing this upstream in cards (or internally) but there are several common tables where this would be useful.
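
To make "counting by highest level per ID" concrete, here is a minimal base R sketch on made-up data. The `adae` data frame and its values are invented for illustration, and this is not the implementation in this PR.

```r
# Toy adverse-event data: one row per event (values invented for illustration).
adae <- data.frame(
  USUBJID = c("01", "01", "02", "02", "02", "03"),
  AESEV = factor(
    c("MILD", "SEVERE", "MODERATE", "MILD", "MODERATE", "MILD"),
    levels = c("MILD", "MODERATE", "SEVERE"),
    ordered = TRUE
  )
)

# Sort by subject and severity, then keep the last row within each subject,
# which is that subject's highest severity.
adae_sorted <- adae[order(adae$USUBJID, adae$AESEV), ]
max_sev <- adae_sorted[!duplicated(adae_sorted$USUBJID, fromLast = TRUE), ]

# Each subject is now counted once, at their maximum level.
max_counts <- table(max_sev$AESEV)
print(max_counts)
```

Counting the raw rows instead would count subjects 01 and 02 more than once, which is what a per-ID maximum avoids.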

ddsjoberg · 2024-11-26T16:23:40Z

Hmmm, before we merge this in, can we brainstorm together a bit? Either today or next week?

ddsjoberg · 2024-11-26T16:53:49Z

Would another way to calculate these quantities be:

```r
ADAE |>
  dplyr::slice_max(AESEV, n = 1, with_ties = FALSE, by = USUBJID) |>
  cards::ard_categorical(
    by = TRTA,
    variables = AESEV,
    denominator = ADSL |> dplyr::select(USUBJID, TRTA = ARM)
  )
```
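
One point that may matter for this approach is how the "maximum" is defined for the column type. The base R sketch below (toy values, not taken from ADAE) shows that a character vector is ranked alphabetically, while an ordered factor is ranked by its levels. It uses `max()` rather than `dplyr::slice_max()` so that it runs without extra packages.

```r
# Severity stored as plain character strings (toy values).
sev_chr <- c("LOW", "HIGH", "MEDIUM")

# For character vectors the maximum is alphabetical, not clinical.
max_chr <- max(sev_chr)

# With an ordered factor, the maximum follows the declared level order.
sev_fct <- factor(sev_chr, levels = c("LOW", "MEDIUM", "HIGH"), ordered = TRUE)
max_fct <- as.character(max(sev_fct))

print(c(character = max_chr, ordered_factor = max_fct))
```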

ddsjoberg

This is great, thank you!! Let's chat about the comments before you move forward making any changes.

R/ard_categorical_max.R

ddsjoberg

This is getting close! I am confusing myself thinking about the dataset passed in denominator when it doesn't have all the by variables. I think it may result in incorrect values.... 🤷🏼

ddsjoberg · 2025-01-11T00:07:22Z

R/ard_categorical_max.R

```diff
+      call = get_cli_abort_call()
+    )
+  }
+  if (is_empty(denominator)) denominator <- data
```


Since we're doing this, should we just make the default value denominator = data in the function definition? Then it will be clearer to users what the default is.
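
A minimal sketch of what that default could look like, using a hypothetical function `count_rows_demo()` (not the actual signature of `ard_categorical_max()`). It relies on R evaluating default arguments lazily, so `denominator = data` refers to whatever was passed as `data`.

```r
# Hypothetical function, named for illustration only.
count_rows_demo <- function(data, denominator = data) {
  # `denominator` is evaluated lazily, so it resolves to `data` when not supplied.
  c(n = nrow(data), N = nrow(denominator))
}

events <- data.frame(id = c(1, 1, 2))
population <- data.frame(id = 1:5)

# Default: the denominator is the data itself.
res_default <- count_rows_demo(events)

# Explicit: the denominator is a separate population data frame.
res_explicit <- count_rows_demo(events, denominator = population)

print(res_default)
print(res_explicit)
```

With this pattern the default shows up in the function signature and in the generated documentation, instead of being set inside the function body.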

ddsjoberg · 2025-01-11T00:09:07Z

R/ard_categorical_max.R

```diff
+    function(x) {
+      ard_categorical(
+        data = data |>
+          cards:::arrange_using_order(c(id, by, x)) |>
```


Unfortunately, CRAN won't allow the :::, so we'll have to copy the function into cardx. But please add a note that it's copied from cards and the reason we are using it (so we don't forget!)

ddsjoberg · 2025-01-11T00:14:26Z

R/ard_categorical_max.R

```diff
+      ard_categorical(
+        data = data |>
+          cards:::arrange_using_order(c(id, by, x)) |>
+          dplyr::slice_tail(n = 1L, by = all_of(c(id, intersect(by, names(denominator))))),
```


This part intersect(by, names(denominator)) is confusing me a bit. What if someone passes an integer as the denom? Then it will return NULL, and as a result, we would have dplyr::slice_tail(n = 1L, by = all_of(id)), but above we sorted by c(id, by, x) so the max value of x could have appeared in the first by level and not been sorted to the bottom. Right?

I am trying to think through the implications of this line... 🤔 What if someone specifies by=c("ARM", "SEX"), but the denominator only has 'SEX'? In the previous step, we've sorted by ID, then ARM, then SEX, then the variable. Then within ID and SEX, we're taking the last observation... Is that what we want? Should it be a requirement that the denom dataset has all the by variables when it is a data frame? (I am really not sure, so I am asking!) 😆
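
The concern can be reproduced on a toy example in base R. The data below are invented, and the case of one ID appearing under two `ARM` values is deliberately artificial. The sketch sorts by `id`, `ARM`, then `x`, and takes the last row per `id` only, which mirrors what would happen if `ARM` dropped out of the grouping.

```r
# One subject with rows under two levels of the by variable (toy data).
dat <- data.frame(
  id = c(1, 1),
  ARM = c("A", "B"),
  x = factor(
    c("SEVERE", "MILD"),
    levels = c("MILD", "MODERATE", "SEVERE"),
    ordered = TRUE
  )
)

# Sort by id, then the by variable, then the analysis variable.
sorted <- dat[order(dat$id, dat$ARM, dat$x), ]

# Take the last row within id only (ARM is not part of the grouping).
last_per_id <- sorted[!duplicated(sorted$id, fromLast = TRUE), ]

# The subject's true maximum level, ignoring ARM.
true_max <- max(dat$x)

print(as.character(last_per_id$x))
print(as.character(true_max))
```

Here the last row per `id` is MILD even though the subject's highest level is SEVERE, because the SEVERE row was sorted into the first `ARM` level. When every by variable is constant within an ID, the two results agree.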

ddsjoberg · 2025-01-11T00:14:48Z

R/ard_categorical_max.R

```diff
+        fmt_fn = fmt_fn,
+        stat_label = stat_label
+      ) |>
+        list()
```


Since we're in an lapply() we don't need to pipe this into list(), right?
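
A small base R illustration of the difference, using placeholder data frames rather than real ARD output. Whether the extra `list()` matters in the PR depends on how the result is combined afterwards, which is not shown here.

```r
vars <- c("a", "b")

# lapply() already returns a list with one element per input.
without_list <- lapply(vars, function(v) {
  data.frame(variable = v)
})

# Piping each result into list() adds a second level of nesting.
with_list <- lapply(vars, function(v) {
  data.frame(variable = v) |>
    list()
})

str(without_list, max.level = 1)
str(with_list, max.level = 2)
```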

ddsjoberg · 2025-01-11T00:15:36Z

R/ard_categorical_max.R

```diff
+
+  # print default order of character variable levels ---------------------------
+  for (v in variables) {
+    if (is.character(data[[v]])) {
```


To be on the safe side, can we print this for all variables?
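
A rough sketch of what reporting the order for every variable might look like. `describe_level_order()` is a hypothetical helper written in base R; it assumes factor columns use their level order and all other columns use sorted unique values, which may differ from what the PR does.

```r
# Hypothetical helper: report the level order used for each variable.
describe_level_order <- function(data, variables) {
  vapply(variables, function(v) {
    # Assumption: factors use their levels, other types use sorted unique values.
    lvls <- if (is.factor(data[[v]])) {
      levels(data[[v]])
    } else {
      sort(unique(data[[v]]))
    }
    msg <- sprintf("%s: %s", v, paste(lvls, collapse = " < "))
    message(msg)
    msg
  }, character(1))
}

toy <- data.frame(
  grp = c("b", "a", "c"),
  sev = factor(c("lo", "hi", "lo"), levels = c("lo", "hi"))
)

order_msgs <- describe_level_order(toy, c("grp", "sev"))
```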

edelarua added 4 commits November 22, 2024 21:59

Add ard_event_rates

8ea2946

Add tests

83bc584

Add check

783b3ed

Add tests

a717455

edelarua added the sme label Nov 25, 2024

edelarua added 2 commits November 25, 2024 18:36

Fix check

c1803f2

Styler

c023a1a

edelarua added 2 commits January 8, 2025 19:06

Refactor as ard_categorical_max

299a505

Merge branch 'main' into 240_ard_event_rates@main

87bcfce

edelarua changed the title ~~Add ard_event_rates()~~ Add ard_categorical_max() Jan 9, 2025

edelarua added 2 commits January 8, 2025 19:09

Delete old file

05ac1df

Update pkgdown

0bd60b5

ddsjoberg self-requested a review January 9, 2025 19:36

Merge branch 'main' into 240_ard_event_rates@main

21040aa

ddsjoberg requested changes Jan 9, 2025

edelarua added 3 commits January 10, 2025 18:03

Update ard_categorical_max

34ce9ac

Merge branch 'main' into 240_ard_event_rates@main

f5b06af

Merge remote-tracking branch 'origin/240_ard_event_rates@main' into 2…

2f14d54

…40_ard_event_rates@main

edelarua requested a review from ddsjoberg January 10, 2025 23:05

ddsjoberg requested changes Jan 11, 2025
